Abstract

We study a class of random 3-SAT instances having exactly one solution. The
properties of this ensemble differ considerably from those of the random
3-SAT ensemble. It is shown numerically that the running time of several
complete and stochastic local search algorithms increases monotonically as
the clause density is decreased. Therefore, there is no easy-hard-easy
pattern of hardness as for the standard random 3-SAT ensemble. Furthermore,
the running time for short single-solution formulas increases with the
problem size much faster than for random 3-SAT formulas from the phase
transition region.



1 Introduction 

The propositional satisfiability problem is one of the most studied problems
in computer science. Its most prominent variant is the 3-satisfiability
(3-SAT) problem. It consists of determining whether there exists an
assignment of truth values to a set of boolean variables such that a given
3-SAT formula is satisfied. A 3-SAT formula involving n variables is a
conjunction (logical AND) of m clauses, each clause being a disjunction
(logical OR) of 3 literals (a literal is a variable or its negation). The
3-SAT problem is important from a theoretical as well as from a practical
point of view. On the theoretical side, it is a paradigmatic example of an
NP-complete (NPC) problem. Historically, it was the first problem to be
shown by Cook [1] to be NPC. The algorithmic complexity of the 3-SAT problem
is connected to various computational complexity issues, most notably to the
famous "P=NP?" question, which is one of the most important unsolved
problems in mathematics and computer science [2]. On the practical side,
3-SAT solving algorithms are used in industry. Because any circuit involving
logical operations can be converted to a 3-SAT formula, they can also be
used for the verification of microprocessors [3,4]. 3-SAT solving can also
be related to deductive reasoning: given a set of facts (statements) $\Phi$,
a new statement C can be deduced if the union $\Phi \cup \{\neg C\}$ is not
satisfiable, i.e. we arrive at a contradiction when assuming the negation
$\neg C$.

Having hard$^1$ 3-SAT instances at hand is important for several reasons.
First, it might help in understanding what makes 3-SAT problems, and NPC
ones in general, so difficult. Second, hard instances are actively sought
for algorithm testing, enabling the design of better algorithms. There are
basically two classes of 3-SAT instances used in testing: those coming from
the real-world applications mentioned above, and artificially generated ones
that are thought to be hard. For the latter one usually uses the so-called
uniform random 3-SAT ensemble. An important discovery was that among random
3-SAT instances hard ones are found around the phase transition
[5,6,7,8,9,10], where the average formula changes from being satisfiable to
being unsatisfiable. Connected to the phase transition phenomenon, it is
believed that for NPC problems one typically has an "easy-hard-easy" pattern
of problem difficulty as some parameter is varied, with the peak difficulty
occurring at the phase transition.

When considering single 3-SAT instances one can ask, for instance, whether a
given formula is harder than some other one. For NP-completeness the
relevant question is how the running time of the hardest instance (for a
given n) increases with its size. It is important to realize that if we want
to talk about the perhaps more interesting statistical properties (e.g.
scaling of the running time, phase transition etc.) we have to specify an
ensemble of 3-SAT instances, that is, define a measure, i.e. a probability
to draw some instance. Therefore, a phase transition phenomenon is not an
inherent property of the 3-SAT problem alone but of the measure, i.e. it is
induced by the drawing procedure (for instance, that of uniform random
3-SAT). But whereas for physical systems there exists a natural measure,
there is no such thing for a mathematical problem like 3-SAT. Physical
systems have a distinguished quantity called the energy, which induces the
canonical measure. The canonical measure depends on the temperature, and as
this parameter is varied a phase transition can occur. For the 3-SAT problem
there is no such "natural" measure. For the most frequently studied random
3-SAT ensemble, literals occur in clauses with equal probability. While this
might seem the least biased choice, there is no a priori reason why such a
measure is better than any other. As the random 3-SAT ensemble is just one
of many possible ones, it is in a way surprising that since the discovery of
the phase transition in 3-SAT most studies have been concerned with the
random 3-SAT ensemble (for an earlier study of the so-called "random clause
length" SAT ensemble see, e.g. [11]). In fact, the measure for random 3-SAT
is based on the syntax of the particular encoding of the problem, and is
therefore not directly related to the problem structure. It might be useful
to consider a measure which directly depends on an inherent 3-SAT property,
e.g. on the number of satisfying assignments. An interesting question then
is how the properties of an ensemble depend on the chosen measure. One of
the initial ideas was [12] that NP-completeness is intimately connected to
the phase transition phenomenon. Does one therefore have an easy-hard-easy
pattern also for other 3-SAT ensembles? Also, what is the hardness of
instances from other ensembles: are they harder than instances from the
phase transition for random 3-SAT?

1 By hard we mean that the number of steps needed by a given algorithm to
solve the problem is larger than for most instances of the same size.

In this work we will try to answer some of these questions. We will study an
ensemble of random 3-SAT instances having a single satisfying assignment. By
an empirical study of the running time of several complete and incomplete
algorithms we are going to show that there is no hardness peak for this
ensemble. In addition, such single-solution instances also seem to be much
harder than 3-SAT instances from the random 3-SAT phase transition region.
We were actually drawn to 3-SAT problems with one solution while studying
the quantum adiabatic algorithm for 3-SAT. The quantum adiabatic
algorithm$^2$ attracted attention because there has been numerical evidence
(for small problems) that the running time for random 3-SAT instances from
the phase transition region increases only quadratically with the size n
[14]. Contrary to that, the scaling of the adiabatic algorithm for instances
with a single solution seems instead to be exponential [15]. It is therefore
also interesting to compare the performance of classical algorithms for
these two ensembles, particularly because some stochastic local search
algorithms seem to be fairly efficient on random 3-SAT problems from the
phase transition. Random instances with a constant number of satisfying
assignments could also improve our understanding of the phase transition
phenomenon in random 3-SAT. Let us first briefly review some known facts
about the uniform random 3-SAT ensemble.



2 Uniform Random 3-SAT 

A 3-CNF formula is a logical statement involving n boolean variables $b_i$.
It consists of m clauses $C_j$ in conjunction (logical AND = $\wedge$),
$C_1 \wedge C_2 \wedge \cdots \wedge C_m$, where each clause $C_j$ is a
disjunction (logical OR = $\vee$) of 3 literals, where a literal is a
variable $b_i$ or its negation $\neg b_i$ (logical NOT = $\neg$). The 3-SAT
problem is to decide whether a given 3-CNF formula, denoted by $\Phi$, is
satisfiable, i.e. whether there exists a truth assignment of the variables
$b_i$ such that $\Phi$ is true. Such an assignment is called a solution, and
the number of solutions will be denoted by r. A given assignment of all
variables will be called a state. An instance from the uniform random 3-SAT
ensemble is generated by drawing three different random variables for each
clause and negating each with probability $1/2$. The



2 For the present status of quantum algorithms see, e.g. the overview [13].

For random 3-SAT problems it has been established that the relevant order
parameter for the phase transition is the ratio of the number of clauses to
the number of variables, the so-called clause density $\alpha = m/n$
[5,8,9]. The critical value $\alpha_c \approx 4.25$ for the transition
between satisfiability and unsatisfiability coincides with the peak in
hardness, i.e. the peak in the running time of an algorithm; see, e.g.
Figure 4. Below this critical $\alpha_c$ random problems are satisfiable
with high probability as they are underconstrained, while above it they are
unsatisfiable because they are overconstrained. The width of the transition
region in the parameter $\alpha$ has been shown to decrease as
$\sim 1/n^{2/3}$ with the number of variables [7]. Still, we do not have an
exact expression for the location of the transition point. The best
presently proved bounds for the critical $\alpha$ are 3.42 for the
satisfiability border [16] and 4.506 for the unsatisfiability border [17],
therefore $3.42 < \alpha_c < 4.506$. See the review [18] for references
about the location of the transition point.
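
The drawing procedure for the uniform random 3-SAT ensemble described above
can be sketched in a few lines of Python. This is only an illustration under
stated assumptions, not the generator used for the experiments: the function
name `random_3sat`, the signed-integer encoding of literals and the rounding
of m = (m/n) * n are all choices made here.

```python
import random

def random_3sat(n, density, rng):
    """Draw a formula with m = round(density * n) clauses over variables 1..n.

    Encoding (an assumption of this sketch): the integer +v stands for the
    variable b_v and -v for its negation.
    """
    m = round(density * n)
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), 3)  # three different variables
        # negate each chosen variable independently with probability 1/2
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        clauses.append(clause)
    return clauses

rng = random.Random(0)              # fixed seed only for reproducibility
formula = random_3sat(n=20, density=4.25, rng=rng)
print(len(formula), formula[:2])
```

Clauses are drawn independently, so the same clause may appear more than
once; whether repeated clauses are allowed in the ensemble is not specified
in the text.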

A very fruitful approach to the 3-SAT problem is to convert it to a spin
glass system and then use various powerful statistical methods. One can
convert a given 3-SAT formula to a (classical) Hamiltonian by the following
simple prescription: to each variable $b_i$ a spin variable $S_i$ is
assigned, with the value $S_i = 1$ corresponding to $b_i = 1$ and
$S_i = -1$ to $b_i = 0$. The Hamiltonian, whose expectation value counts the
number of clauses unsatisfied by a state, is a sum of terms for each clause
$C_j$, $H = \sum_{j=1}^{m} H_{C_j}$, where the rule for the Hamiltonian
$H_{C_j}$ describing the clause $C_j$ can be best seen from an example,
$(b_2 \vee \neg b_4 \vee b_5) \longrightarrow H_{C_j} = \frac{1}{8}(1 - S_2)(1 + S_4)(1 - S_5)$,
i.e. the signs in front of the spin variables are determined by the clauses.
A solution will therefore have energy 0, and the question of satisfiability
is translated into the question of whether the ground state of H has energy
zero. Statistical methods have been used to estimate $\alpha_c$ and to show
that the number of solutions just below the transition point is
exponentially large [19], so the transition is reminiscent of a
discontinuous (1st order) phase transition in statistical physics. Analysis
of the phase space structure also resulted in a new survey propagation
algorithm [20]. The hardness of instances at the transition point has been
connected with the discontinuous occurrence of a "backbone". A backbone is a
set of variables that are fully constrained, i.e. have the same value in all
solutions. Below $\alpha_c$ the backbone is zero, while it is nonzero (and
bounded away from zero) above $\alpha_c$ [21]. If the backbone is large and
the problem is overconstrained, a backtracking algorithm will quickly
"realize" it made a wrong assignment. On the other hand, if the backbone is
small and the problem is underconstrained, there are many "good" beginning
assignments which will lead to the solution.
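
As a small check of the clause-to-Hamiltonian prescription above, the
following Python sketch evaluates the clause term for the example clause
(b_2 OR NOT b_4 OR b_5). The function names and the signed-integer literal
encoding are assumptions of this sketch. Each factor is 0 or 2, so the
product of three factors is 0 or 8, and dividing by 8 gives 1 exactly when
the clause is violated.

```python
def clause_energy(clause, spins):
    """(1/8) * product over literals of (1 - sign * S); 1 if violated, else 0.

    `clause` holds signed integers (+v for b_v, -v for NOT b_v) and `spins`
    maps a variable index to +1 (b = 1) or -1 (b = 0).
    """
    product = 1
    for literal in clause:
        sign = 1 if literal > 0 else -1
        product *= 1 - sign * spins[abs(literal)]
    return product // 8

def energy(clauses, spins):
    """Total energy H: the number of clauses violated by the state `spins`."""
    return sum(clause_energy(c, spins) for c in clauses)

example = (2, -4, 5)                      # (b_2 OR NOT b_4 OR b_5)
violating = {2: -1, 4: 1, 5: -1}          # b_2 = 0, b_4 = 1, b_5 = 0
satisfying = {2: 1, 4: 1, 5: -1}          # b_2 = 1 satisfies the clause
print(clause_energy(example, violating), clause_energy(example, satisfying))
```

A state is a solution exactly when `energy` returns 0 for the whole formula.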



3 Related work 

In this section we will give a list of related studies that deal with the
subjects covered in the present paper. This includes studies of instances
with a fixed number of solutions, scaling of the running time with n,
generating methods for hard instances and various results about the
difficulty of short 3-SAT formulas (having small m).

Most of the studies of the random 3-SAT ensemble have been concerned with
the computational cost at a constant n as a function of the ratio m/n, where
the characteristic phase transition-like curve is observed. This is in a way
surprising because for computational complexity (and also for practical
applications) it is the scaling of the running time as the problem size
increases which is important, i.e. changing n at fixed m/n. Exponential
scaling with n has been numerically observed near the critical point [8,10]
for random 3-SAT as well as above it (albeit with a smaller exponent).
Recently the scaling with n has been studied and a transition from
polynomial to exponential complexity has been observed below $\alpha_c$
[22], again for random 3-SAT.

There has been numerical evidence [5,23,24] that below $\alpha_c$ short
instances of 3-SAT as well as of graph coloring [25] can be hard. With
respect to the formula size, an interesting rigorous result is [26,27] that
an ordered DPLL algorithm needs an exponential time $2^{\Omega(n/\alpha)}$
to find a resolution proof of an unsatisfiable 3-SAT instance. Note that the
coefficient of the exponential growth increases with decreasing $\alpha$,
i.e. short formulas are harder. For our ensemble of single-solution formulas
we will find the same behavior.

Generating methods for 3-SAT problems having one solution have been devised
employing the Latin square problem [28] as well as by transforming
factorization to 3-SAT [29]. In both cases the parameter $\alpha$ of the
resulting 3-SAT instances grows with the problem size n. One can also use
the conversion of some other NPC problem to 3-SAT [30]. Hard instances in
the underconstrained region can be generated by embedding a smaller
unsatisfiable subproblem [31] into a larger instance. A ferromagnetic phase
transition in a spin glass has been exploited to generate hard satisfiable
3-SAT instances in the overconstrained region, $m/n > \alpha_c$ [32]. Hard
instances can also be generated by hiding satisfying assignments [33]. This
work has been extended [34] to produce even harder instances, particularly
for stochastic local search methods. For instance, the number of necessary
Walksat steps grows exponentially as $\propto 2^{0.1n}$ for the hardest
instances generated.

As regards the connection between the number of solutions and the formula
difficulty, there have been several works, but none studied in detail how
the running time scales if the number of solutions is held fixed. In [35] it
has been found that for constraint satisfaction problems the difficulty
monotonically increases with decreasing phase transition parameter. Later it
was found [36] that the existence of the peak in hardness can depend
sensitively on the ensemble and the algorithm used. Hoos [37] found a
correlation between the number of solutions and the problem difficulty, i.e.
instances with fewer solutions tend to be harder for stochastic local search
methods, see also [38]. Problems having a small backbone seem to have a
stronger correlation between the number of solutions and local search
difficulty [39].



4 Random 3-SAT instances with one solution 

Although a precise understanding of the phase transition phenomenon in
random 3-SAT is still lacking, a frequent heuristic explanation for its
occurrence (see, e.g. [40]) is the combination of a decreasing number of
solutions as $\alpha$ is increased and, at the same time, increased pruning
of the search tree, resulting in maximal complexity for some intermediate
$\alpha$. This "explanation" classifies 3-SAT instances according to the
number of solutions they have. Therefore, an ensemble of random 3-SAT
instances having a constant number of solutions would presumably also tell
us something about the phase transition itself. A second motivation for
choosing a single-solution ensemble is the fact that for some applications
[29,28] single-solution 3-SAT instances might be a more "natural" choice
than random 3-SAT.

Fig. 1. Frequency of 3-SAT formulas with exactly r = 1 solution among random
instances at m/n = 3, i.e. the inverse probability to get such a formula.
Each point is an average over 100 instances.

Our ensemble therefore consists of random 3-SAT instances that have a single
satisfying assignment (r = 1). As the emphasis of this paper is on
identifying a new interesting ensemble of 3-SAT problems and not primarily
on generating very large problems, we used a rather inefficient method to
generate this ensemble. We obtained random 3-SAT instances with r = 1
solution by simply filtering randomly generated formulas through a 3-SAT
solver, keeping only those with r = 1. The method is very inefficient
because the expected number of solutions (below $\alpha_c$) grows
exponentially with n. Consequently, the probability to find an instance with
only one solution in the random 3-SAT ensemble decays exponentially with n.
This can be seen in Figure 1. This rather technical issue of inefficient
generation limited us to problems of relatively small size $n \le 40$. For
instance, to generate 1000 random 3-SAT instances with n = 40, r = 1 and
m/n = 3 we had to solve approximately $10^{10}$ randomly generated problems.
Still, we think that it is conceivable to generate single-solution problems
with constant $\alpha$ in a more efficient way, e.g. by converting one of
the many known NPC problems [41] to 3-SAT with one solution. The probability
to find single-solution formulas among random 3-SAT instances changes with
m/n and attains its maximum at the location of the transition point for
random 3-SAT, see Figure 2.

Fig. 2. The dependence of the probability p(r = 1) to find a single-solution
instance in the random 3-SAT ensemble on m/n, for three different n.
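
The filtering procedure described in this section can be sketched as
follows. This is a toy version under stated assumptions: it counts solutions
by brute-force enumeration of all 2^n states instead of calling a 3-SAT
solver as in the text, so it is only usable for very small n, and all
function names are hypothetical.

```python
import itertools
import random

def random_3sat(n, m, rng):
    """m random clauses over variables 1..n; +v is b_v, -v its negation."""
    return [tuple(v if rng.random() < 0.5 else -v
                  for v in rng.sample(range(1, n + 1), 3))
            for _ in range(m)]

def count_solutions(n, clauses, limit=2):
    """Count satisfying assignments by enumeration, stopping at `limit`."""
    count = 0
    for bits in itertools.product((False, True), repeat=n):
        # a literal l is true when variable |l| has the value (l > 0)
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            count += 1
            if count >= limit:
                break
    return count

def single_solution_instance(n, m, rng, max_tries=100_000):
    """Draw random formulas until one has exactly one solution (r = 1)."""
    for tries in range(1, max_tries + 1):
        clauses = random_3sat(n, m, rng)
        if count_solutions(n, clauses) == 1:
            return clauses, tries
    raise RuntimeError("no single-solution instance found")

instance, tries = single_solution_instance(n=8, m=24, rng=random.Random(1))
print(tries, "formulas drawn to obtain one instance with r = 1")
```

Stopping the count at two solutions is enough to decide whether r = 1, which
avoids enumerating the remaining states of formulas that are rejected
anyway.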



4.1 Algorithms used

We will test several algorithms for solving the 3-SAT problem. Generally
there are two classes of algorithms: (i) complete ones, which determine
satisfiability or unsatisfiability of a given formula. They terminate after
a finite number of steps, either by finding a solution or by proving
unsatisfiability. (ii) incomplete algorithms, which can only find a solution
but cannot prove unsatisfiability. In principle they have no terminating
condition and can therefore be used only on satisfiable formulas, on which,
however, they can perform better than complete algorithms. Typically they
employ some sort of localized random search.

The most popular complete method is the DPLL algorithm [42], sometimes also
called just the DP algorithm because it is based on an earlier algorithm by
Davis and Putnam [43]. DPLL is a backtracking depth-first algorithm. It
assigns truth values to variables and simplifies the formula. The
simplification of the formula when assigning the value TRUE to a single
literal v consists of deleting the clauses that are satisfied by the truth
assignment, i.e. contain the literal v, and deleting all literals
contradicting the assignment in the other clauses, i.e. all occurrences of
$\neg v$. The algorithm therefore descends along the state tree by recursive
calls until it either finds a solution or encounters a contradiction, i.e.
an empty clause occurs. In the latter case it backtracks by changing a
previously made assignment. The number of recursive calls of the DPLL
procedure is a good measure of the running time. There are different
variants of the DPLL algorithm depending on the variable-selection rule,
i.e. on the heuristic by which we choose the next variable to assign. In the
algorithm we use, we pick the first variable in the first unsatisfied
clause. To illustrate the DPLL algorithm we can plot a search tree, as shown
in Figure 3 for an instance with n = 26 variables and m/n = 3, having
exactly one solution, r = 1. Each vertex denotes a state with a certain
number of assigned variables, which is printed next to the vertex. The
algorithm therefore starts from the white vertex with the number 0 and ends
at the black vertex with the number 26, denoting a state which is a solution
of the 3-SAT formula. The number of DPLL calls needed to find the solution,
equal to the number of vertices, was 278 in this case. The search tree has
been plotted using the network analysis program Pajek [44].

Fig. 3. Search tree of the DPLL algorithm for a 3-SAT instance with n = 26
and m/n = 3 having one solution, r = 1. The color of the vertices and the
numbers denote the number of assigned variables.
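
The backtracking procedure described above can be sketched in Python as
follows. This is a minimal illustration, not the implementation used for the
experiments: it contains only the simplification step, the
first-variable-in-first-clause rule and the call counter described in the
text. Unit propagation and the pure-literal rule are left out because the
text does not say whether they were used, and trying TRUE before FALSE is an
assumption of this sketch.

```python
def simplify(clauses, literal):
    """Assign TRUE to `literal`: drop satisfied clauses, delete -literal."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue                                  # clause is satisfied
        result.append(tuple(l for l in clause if l != -literal))
    return result

def dpll(clauses, assignment, stats):
    """Return a list of true literals satisfying `clauses`, or None."""
    stats["calls"] += 1                               # running-time measure
    if not clauses:
        return assignment                             # all clauses satisfied
    if any(len(c) == 0 for c in clauses):
        return None                                   # empty clause: backtrack
    variable = abs(clauses[0][0])     # first variable of the first clause
    for literal in (variable, -variable):             # TRUE first, then FALSE
        result = dpll(simplify(clauses, literal), assignment + [literal], stats)
        if result is not None:
            return result
    return None

# Seven clauses over three variables; each clause excludes one assignment,
# so only b_1 = b_2 = b_3 = TRUE remains (a formula with r = 1).
clauses = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3),
           (-1, -2, 3), (-1, 2, -3), (1, -2, -3)]
stats = {"calls": 0}
solution = dpll(clauses, [], stats)
print(solution, stats["calls"])
```

Variables that never need to be assigned are simply absent from the returned
list; the counter in `stats` corresponds to the number of vertices of a
search tree such as the one in Figure 3.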
Nowadays there are of course many modern algorithms that are much faster
than the original DPLL, although most of them are still based on the DPLL
idea. Each year a competition for SAT solvers is organized, comparing their
performance on artificial and industrial SAT problems [45]. In addition to
simple DPLL we have also tested one of those, namely SATO$^3$ by H. Zhang
[46]. SATO is based on DPLL but uses a different splitting rule and also
uses "intelligent backjumping", meaning that it need not backtrack step by
step but can jump over several steps. In all numerical experiments the
results for SATO were qualitatively the same as for DPLL.

For satisfiable problems the so-called stochastic local search algorithms
can be more effective than complete algorithms. We have used GWSAT [47] as
our main stochastic algorithm. It is based on GSAT [6,48], and is one of the
most widely studied incomplete methods. At the beginning of the algorithm we
randomly draw a state, i.e. choose a random assignment of variables. Then at
each step we change the truth value of one variable (such a step is also
called a flip). For local search methods we need a cost function that
measures how good different flips are, so that we can choose the best one at
each step. In GWSAT we choose with probability $1-p$ the variable whose flip
leads to the state with the largest number of satisfied clauses (GSAT step),
and with probability p a random variable from a randomly chosen unsatisfied
clause. This is repeated until a solution is found. A good measure of the
running time is the number of flips made until a solution is found. In
addition to GWSAT we also tested Walksat [47] and Adaptive Novelty+ [49].
For all stochastic local search algorithm implementations we used the Ubcsat
program [50]. Again, as for the complete methods, the results for the more
advanced Walksat and Adaptive Novelty+ algorithms were qualitatively the
same as for GWSAT. For a detailed comparison of different local search
algorithms see, e.g. [51].
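
The GWSAT step described above can be sketched in Python as follows. This is
an illustration only and not the Ubcsat implementation used for the
experiments: the value of p, the random tie-breaking among equally good
flips and the `max_flips` cutoff are assumptions of this sketch.

```python
import random

def unsatisfied(clauses, state):
    """Clauses with no true literal under `state` (dict: variable -> bool)."""
    return [c for c in clauses
            if not any(state[abs(l)] == (l > 0) for l in c)]

def gwsat(n, clauses, p, rng, max_flips=100_000):
    """Return (state, flips) for a solution, or (None, flips) at the cutoff."""
    state = {v: rng.random() < 0.5 for v in range(1, n + 1)}  # random start
    flips = 0
    while True:
        unsat = unsatisfied(clauses, state)
        if not unsat:
            return state, flips
        if flips == max_flips:
            return None, flips
        if rng.random() < p:
            # walk step: random variable from a random unsatisfied clause
            var = abs(rng.choice(rng.choice(unsat)))
        else:
            # GSAT step: flip giving the fewest unsatisfied clauses
            scores = {}
            for v in range(1, n + 1):
                state[v] = not state[v]
                scores[v] = len(unsatisfied(clauses, state))
                state[v] = not state[v]
            best = min(scores.values())
            var = rng.choice([v for v, s in scores.items() if s == best])
        state[var] = not state[var]
        flips += 1

# Toy formula with exactly one solution (all three variables TRUE).
clauses = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3),
           (-1, -2, 3), (-1, 2, -3), (1, -2, -3)]
solution, flips = gwsat(3, clauses, p=0.5, rng=random.Random(0))
print(solution, flips)
```

Minimizing the number of unsatisfied clauses is the same as maximizing the
number of satisfied ones, and the returned `flips` is the running-time
measure used in the text.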

In the next two subsections we will present the main results of the paper,
the analysis of the running time of various 3-SAT solving algorithms for an
ensemble of random 3-SAT instances with 1 solution. First we will show the
dependence of the running time on $\alpha$ at constant n in order to
demonstrate that there is no phase transition-like peak in the difficulty.



4.2 Running time at constant n

The data for DPLL are shown in Figure 4. In addition to the curve for random
instances with r = 1 we also plot one for the random 3-SAT ensemble
(arbitrary r) and one for random instances with r > 1 solutions. We can
clearly see that the running time of instances with exactly one solution
increases with decreasing m/n and in fact becomes larger, for sufficiently
small clause density, than for instances around the phase transition point
of random 3-SAT. Qualitatively the same results are obtained also for SATO.

3 We used SATO v. 4.1.

Fig. 4. Running time (as a function of m/n) for the DPLL algorithm for
random 3-SAT with r = 1 solution (empty triangles), r > 1 solutions (full
squares) and arbitrary r (empty circles). The inset shows the maximal
running time (same three data sets) out of the 1000 instances used for each
point. All data are for n = 30.

Fig. 5. Median running time for the GWSAT algorithm for random 3-SAT with
r = 1 solution (empty triangles) and r > 1 solutions (full squares). The
inset shows the maximal running time (same data sets) out of the 1000
instances used for each point. All data are for n = 30.

In Figure 5 we show the results for the GWSAT algorithm. Again one can
observe that the running time is larger for the single-solution ensemble and
that there is no peak in the difficulty. A similar figure is obtained also
for Walksat and Adaptive Novelty+. Therefore, one can see that for the
ensemble of single-solution random 3-SAT the difficulty monotonically
increases with decreasing m/n. In contrast to the random 3-SAT ensemble,
there is no peak in the difficulty for our ensemble. In view of that, we
will study in the next section how the difficulty scales with n for constant
m/n. Because instances with smaller m are harder, we will choose m/n = 3.
Note that by choosing an even smaller m/n one would get even harder
instances, but then our inefficient generating method gets too slow.



4.3 Running time at constant m/n



Even though one can see in Figures 4 and 5 that instances from the
single-solution ensemble at m/n = 3 are harder than the phase transition
ones from the random 3-SAT ensemble for the shown n = 30, their difficulty
could scale differently with n. The real question then is what happens when
we increase n. We will always compare results for the single-solution
ensemble at m/n = 3 with those for the random 3-SAT ensemble at m/n = 4.5,
which is the approximate location of the transition point for random 3-SAT
at the small n studied here. Each time we will average over 1000 3-SAT
instances. In Figure 6 one can see that for the DPLL algorithm the running
time for single-solution instances increases faster than for random
instances from the phase transition point. The difference in hardness
therefore increases with increasing size.

Fig. 6. Scaling of the running time for the DPLL algorithm. Triangles are
for r = 1 at m/n = 3 and circles for arbitrary r at m/n = 4.5.

Similar results can be observed also for GWSAT in Figure 7. For stochastic
search methods we again average over 1000 instances and, for each instance,
over 100 runs of the algorithm. The difference again increases with
increasing n, this time even faster than for DPLL. In Figure 8 we show a
similar result also for Adaptive Novelty+. The number of necessary flips is
smaller than for GWSAT but the overall behavior is qualitatively the same.
Although the range of n is relatively small, we also plotted the best
fitting exponential or power-law dependences. While for random 3-SAT
instances from the phase transition one has a slow polynomial growth
$\propto n^2$ of the running time, the increase is much faster for the
single-solution ensemble, agreeing well with an exponential
$\propto 2^{0.18n}$. The coefficient of exponential growth 0.18 is actually
fairly large, meaning that short single-solution random instances get harder
very quickly. For comparison, for the hard satisfiable 3-SAT instances
reported in [34] the increase of the number of flips for the Walksat
algorithm was approximately $\propto 2^{0.1n}$ (with a large pre-factor).
For our single-solution ensemble the Walksat algorithm is slightly slower
than Adaptive Novelty+. Interestingly, the same difference between the
complexity scaling of the two ensembles as here, namely polynomial vs.
exponential, has also been found for the quantum adiabatic algorithm
[14,15]. It might well be that the instances from the phase transition
region of the random 3-SAT ensemble are not that difficult and that a
polynomial average cost algorithm is possible.

Fig. 7. Scaling of the running time for the GWSAT algorithm. Empty triangles
are for r = 1 at m/n = 3 and full squares for r > 1 at m/n = 4.5.
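
One simple way to compare an exponential with a power-law dependence, as
mentioned above, is a least-squares straight-line fit of log2(time) against
n and against log2(n). The sketch below only illustrates this idea and is
not the fitting procedure used for the figures, which the text does not
specify. The data are synthetic, generated from an exponential with the
growth coefficient 0.18 quoted above and an arbitrary prefactor; they are
not measured running times.

```python
import math

def linear_fit(xs, ys):
    """Least-squares line y = a + b*x; returns (a, b, residual sum of squares)."""
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, rss

sizes = list(range(10, 41, 5))
times = [11 * 2 ** (0.18 * n) for n in sizes]     # synthetic, not measured
log_t = [math.log2(t) for t in times]

# exponential model: log2(t) is linear in n, the slope is the growth rate
_, exp_rate, exp_rss = linear_fit(sizes, log_t)
# power-law model: log2(t) is linear in log2(n), the slope is the power
_, power, pow_rss = linear_fit([math.log2(n) for n in sizes], log_t)
print(exp_rate, exp_rss, power, pow_rss)
```

For data that really grow exponentially, the first fit recovers the growth
coefficient and leaves a much smaller residual than the power-law fit. Over
a short range of n the two models can be hard to tell apart, which is why
the text itself notes that the range of n is relatively small.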

What is the explanation for the difficulty of single-solution random 3-SAT 
instances with small m/n? We will give some heuristic arguments why it might 
not be so unexpected that such 3-SAT instances are hard to solve. 

Fig. 8. Scaling of the running time for Adaptive Novelty+. Empty triangles
are for r = 1 at m/n = 3 and full squares for r > 1 at m/n = 4.5. Dashed
curves are $11 \cdot 2^{0.18n}$ (top) and $0.09 \cdot n^{2.0}$ (lower). In
the lower plot we show the maximal times.

For single-solution instances it turns out [15] that the number of
assignments that violate only one clause, called the excited states, is very
large. In fact, the number of such excited states grows exponentially with
n. Therefore, 3-SAT instances from the single-solution random ensemble have
only one solution and at the same time exponentially many assignments
violating only one (or a few) clauses. Any complete method, like DPLL for
instance, must do an exhaustive search in the state tree until a
contradiction is encountered and the algorithm has to backtrack. If the
assignments made are such that we are descending towards an excited state
which violates only one clause, a contradiction will occur only after we
have assigned all three variables occurring in this clause. This can
possibly occur very deep within the tree, causing large amounts of
backtracking. Simply said, the excited states "fool" the algorithm into
descending along the wrong branches. This can be seen in Figure 9. This time
the numbers next to the vertices denote the number of excited states in the
sub-tree below the vertex. We can see that long branches are usually
correlated with a large number of excited states. As a consequence, the
search tree is large, with many long branches. One can also argue
differently: assuming a random truth assignment of the first assigned
variable in a DPLL-like algorithm, one will with probability $1/2$ end up
with an unsatisfiable problem. For this unsatisfiable problem a rigorous
statement [52] about the exponential complexity of DPLL even below the
satisfiability border $\alpha_c$ suggests an exponential complexity.
Similarly, the running time is expected to be exponential also for
incomplete stochastic local search methods. For such algorithms the
exponential number of excited states will effectively overshadow the single
real solution (searching for a needle in a haystack). To circumvent the
exponential complexity the algorithm would have to efficiently distinguish
between exponentially many states violating only one clause and a single
solution satisfying all clauses. This cannot of course be excluded for some
yet to be found smart choice of moves or a smart variable-selection rule in
a DPLL-like algorithm, but it seems unlikely because there is simply very
little information available which the algorithm could use in its
heuristics. Remember that we are concentrating on underconstrained instances
having as few clauses as possible. Of course, the explanation with excited
states is probably only part of the story. It would be interesting to
investigate the phase space structure of such instances in greater detail.
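
For very small instances the number of excited states can be counted
directly by enumerating all 2^n states. The sketch below is a brute-force
illustration under stated assumptions (hypothetical function name,
signed-integer literal encoding); it is not the method of [15] and does not
scale to the sizes studied in the paper.

```python
import itertools

def violation_histogram(n, clauses):
    """Map k -> number of states that violate exactly k clauses."""
    histogram = {}
    for bits in itertools.product((False, True), repeat=n):
        # a clause is violated when none of its literals is true
        violated = sum(not any(bits[abs(l) - 1] == (l > 0) for l in c)
                       for c in clauses)
        histogram[violated] = histogram.get(violated, 0) + 1
    return histogram

# Toy formula with exactly one solution: each of the seven clauses excludes
# a different one of the eight states, leaving only all-TRUE.
clauses = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3),
           (-1, -2, 3), (-1, 2, -3), (1, -2, -3)]
histogram = violation_histogram(3, clauses)
print(histogram.get(0, 0), "solution(s),", histogram.get(1, 0), "excited state(s)")
```

The entry for k = 0 is the number of solutions r and the entry for k = 1 is
the number of excited states in the sense used above.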




Fig. 9. Search tree for DPLL and 3-SAT instance with n = 26 and m/n = 3 (the 
same instance as in Figure 3). Numbers next to vertices denote the number of 
excited states in the sub-tree below a given vertex. The color of vertices denotes the 
depth in the tree as before. 



5 Conclusions 

We have identified and studied a new ensemble of 3-SAT instances, namely
random 3-SAT formulae having a single satisfying assignment. Numerical
experiments show that the properties of this ensemble are significantly
different from those of the random 3-SAT ensemble. The difficulty of
single-solution instances monotonically increases with decreasing clause
density, that is, shorter formulas are generally harder. Therefore, this
ensemble does not exhibit an easy-hard-easy pattern of difficulty. Short
single-solution instances (having e.g. m/n = 3) are in fact much harder than
problems from the phase transition region of the random 3-SAT ensemble. It
would be interesting to investigate the nature of their hardness in more
detail. 3-SAT instances can be divided into ensembles according to the
number of satisfying assignments they have. Here we studied only the
ensemble having one satisfying assignment. An interesting question is
whether the behavior of other ensembles is similar, for instance whether the
difficulty of instances decays monotonically with m/n. If yes, the
occurrence of the maximal difficulty for random 3-SAT at the phase
transition can be viewed as being due to the changing probability to draw
instances with a fixed number of solutions. For the single-solution
instances studied in the present paper, the highest probability to find them
among random 3-SAT instances occurs at the transition point and decays fast
away from it. Because this decay is faster than the increase of their
difficulty for small m, a maximum of difficulty occurs at the location of
the maximum probability.

The author would like to thank the Alexander von Humboldt Foundation for
its support.